Overview

Dataset statistics

Number of variables: 13
Number of observations: 891
Missing cells: 14
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 78.7 KiB
Average record size in memory: 90.4 B
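
The percentage and per-record figures above follow from the raw counts. The sketch below reproduces that arithmetic. It assumes the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 B; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 13
n_observations = 891
missing_cells = 14
total_size_kib = 78.7  # already rounded in the report

# Share of missing cells over the whole table (assumed definition).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded to one decimal, the per-record figure only agrees with the report to about one decimal place.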

Variable types

Numeric: 5
Categorical: 6
Text: 2

Alerts

Age is highly overall correlated with Age_class and 1 other field (High correlation)
Survived is highly overall correlated with Sex (High correlation)
Sex is highly overall correlated with Survived (High correlation)
Age_class is highly overall correlated with Age and 1 other field (High correlation)
Age_class2 is highly overall correlated with Age and 1 other field (High correlation)
Age_class has 14 (1.6%) missing values (Missing)
PassengerId is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
Name has unique values (Unique)
SibSp has 608 (68.2%) zeros (Zeros)
Parch has 678 (76.1%) zeros (Zeros)
Fare has 15 (1.7%) zeros (Zeros)

Reproduction

Analysis started: 2023-11-11 03:05:42.747341
Analysis finished: 2023-11-11 03:05:44.146300
Duration: 1.4 seconds
Software version: ydata-profiling v4.6.1
Download configuration: config.json
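
The reported duration is the difference between the two timestamps above. A minimal check with Python's standard `datetime` module (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-11-11 03:05:42.747341")
finished = datetime.fromisoformat("2023-11-11 03:05:44.146300")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.1f} seconds")
```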

Variables

PassengerId
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45.5
Q1: 223.5
Median: 446
Q3: 668.5
95-th percentile: 846.5
Maximum: 891
Range: 890
Interquartile range (IQR): 445

Descriptive statistics

Standard deviation: 257.35384
Coefficient of variation (CV): 0.57702655
Kurtosis: -1.2
Mean: 446
Median Absolute Deviation (MAD): 223
Skewness: 0
Sum: 397386
Variance: 66231
Monotonicity: Strictly increasing
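
The quantile and descriptive statistics above can be reproduced with the standard library. The sketch assumes that PassengerId is simply the integers 1 to 891, which is consistent with the reported minimum, maximum, distinct count and sum but is not stated outright. It also assumes sample variance (n - 1 denominator) and linearly interpolated quantiles.

```python
import statistics

# Assumption: PassengerId takes the values 1, 2, ..., 891.
ids = [float(i) for i in range(1, 892)]

mean = statistics.mean(ids)
variance = statistics.variance(ids)   # sample variance, n - 1 denominator
std = statistics.stdev(ids)
cv = std / mean                       # coefficient of variation
median = statistics.median(ids)
mad = statistics.median(abs(x - median) for x in ids)

# "inclusive" interpolates linearly between the observed minimum and maximum.
quartiles = statistics.quantiles(ids, n=4, method="inclusive")
twentieths = statistics.quantiles(ids, n=20, method="inclusive")
q1, q3 = quartiles[0], quartiles[2]
p5, p95 = twentieths[0], twentieths[-1]

print(mean, variance, round(std, 5), round(cv, 8), mad)
print(p5, q1, median, q3, p95, q3 - q1)
```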
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
1                   1      0.1%
599                 1      0.1%
588                 1      0.1%
589                 1      0.1%
590                 1      0.1%
591                 1      0.1%
592                 1      0.1%
593                 1      0.1%
594                 1      0.1%
595                 1      0.1%
Other values (881)  881    98.9%

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Value  Count  Frequency (%)
891    1      0.1%
890    1      0.1%
889    1      0.1%
888    1      0.1%
887    1      0.1%
886    1      0.1%
885    1      0.1%
884    1      0.1%
883    1      0.1%
882    1      0.1%

Survived
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value  Count
0      549
1      342

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      549    61.6%
1      342    38.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      549    61.6%
1      342    38.4%

Most occurring characters

Value  Count  Frequency (%)
0      549    61.6%
1      342    38.4%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  891    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      549    61.6%
1      342    38.4%

Most occurring scripts

Value   Count  Frequency (%)
Common  891    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      549    61.6%
1      342    38.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  891    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      549    61.6%
1      342    38.4%

Pclass
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value  Count
3      491
1      216
2      184

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value  Count  Frequency (%)
3      491    55.1%
1      216    24.2%
2      184    20.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
3      491    55.1%
1      216    24.2%
2      184    20.7%

Most occurring characters

Value  Count  Frequency (%)
3      491    55.1%
1      216    24.2%
2      184    20.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  891    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
3      491    55.1%
1      216    24.2%
2      184    20.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  891    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
3      491    55.1%
1      216    24.2%
2      184    20.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  891    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
3      491    55.1%
1      216    24.2%
2      184    20.7%

Name
Text

UNIQUE 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 82
Median length: 52
Mean length: 26.965208
Min length: 12

Characters and Unicode

Total characters: 24026
Distinct characters: 60
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
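
To make the category counts concrete, the sketch below tallies Unicode general categories for the first sample value of this variable ("Braund, Mr. Owen Harris"). It uses Python's standard `unicodedata` module purely for illustration, which is not necessarily what the profiler uses internally. `CATEGORY_NAMES` and `category_counts` are hypothetical helpers.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes seen in this report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(text: str) -> Counter:
    """Count characters of `text` per Unicode general category."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    # Fall back to the two-letter code when no long name is listed.
    return Counter({CATEGORY_NAMES.get(c, c): n for c, n in codes.items()})

counts = category_counts("Braund, Mr. Owen Harris")
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Summing such counters over every row of a column would give a per-category table of the kind shown under "Most occurring categories".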

Unique

Unique: 891
Unique (%): 100.0%

Sample

1st row: Braund, Mr. Owen Harris
2nd row: Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3rd row: Heikkinen, Miss. Laina
4th row: Futrelle, Mrs. Jacques Heath (Lily May Peel)
5th row: Allen, Mr. William Henry

Value                Count  Frequency (%)
mr                   521    14.4%
miss                 182    5.0%
mrs                  129    3.6%
william              64     1.8%
john                 44     1.2%
master               40     1.1%
henry                35     1.0%
george               24     0.7%
james                24     0.7%
charles              23     0.6%
Other values (1515)  2538   70.0%

Most occurring characters

Value              Count  Frequency (%)
(space)            2735   11.4%
r                  1958   8.1%
e                  1703   7.1%
a                  1657   6.9%
i                  1325   5.5%
n                  1304   5.4%
s                  1297   5.4%
M                  1128   4.7%
l                  1067   4.4%
o                  1008   4.2%
Other values (50)  8844   36.8%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   15446  64.3%
Uppercase Letter   3645   15.2%
Space Separator    2735   11.4%
Other Punctuation  1899   7.9%
Close Punctuation  144    0.6%
Open Punctuation   144    0.6%
Dash Punctuation   13     0.1%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
r                  1958   12.7%
e                  1703   11.0%
a                  1657   10.7%
i                  1325   8.6%
n                  1304   8.4%
s                  1297   8.4%
l                  1067   6.9%
o                  1008   6.5%
t                  667    4.3%
h                  517    3.3%
Other values (16)  2943   19.1%

Uppercase Letter
Value              Count  Frequency (%)
M                  1128   30.9%
A                  250    6.9%
J                  215    5.9%
H                  203    5.6%
S                  180    4.9%
C                  172    4.7%
E                  166    4.6%
W                  143    3.9%
B                  140    3.8%
L                  129    3.5%
Other values (15)  919    25.2%

Other Punctuation
Value  Count  Frequency (%)
.      892    47.0%
,      891    46.9%
"      106    5.6%
'      9      0.5%
/      1      0.1%

Space Separator
Value    Count  Frequency (%)
(space)  2735   100.0%

Close Punctuation
Value  Count  Frequency (%)
)      144    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      144    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      13     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   19091  79.5%
Common  4935   20.5%

Most frequent character per script

Latin
Value              Count  Frequency (%)
r                  1958   10.3%
e                  1703   8.9%
a                  1657   8.7%
i                  1325   6.9%
n                  1304   6.8%
s                  1297   6.8%
M                  1128   5.9%
l                  1067   5.6%
o                  1008   5.3%
t                  667    3.5%
Other values (41)  5977   31.3%

Common
Value    Count  Frequency (%)
(space)  2735   55.4%
.        892    18.1%
,        891    18.1%
)        144    2.9%
(        144    2.9%
"        106    2.1%
-        13     0.3%
'        9      0.2%
/        1      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  24026  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            2735   11.4%
r                  1958   8.1%
e                  1703   7.1%
a                  1657   6.9%
i                  1325   5.5%
n                  1304   5.4%
s                  1297   5.4%
M                  1128   4.7%
l                  1067   4.4%
o                  1008   4.2%
Other values (50)  8844   36.8%

Sex
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value   Count
male    577
female  314

Length

Max length: 6
Median length: 4
Mean length: 4.704826
Min length: 4

Characters and Unicode

Total characters: 4192
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
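
The length and character totals for this variable follow directly from the two category counts above. A small sketch of that arithmetic (the variable names are illustrative):

```python
# Category counts copied from the summary above.
counts = {"male": 577, "female": 314}

n_rows = sum(counts.values())
# Each occurrence contributes len(label) characters.
total_characters = sum(len(label) * n for label, n in counts.items())
mean_length = total_characters / n_rows
# Distinct characters across all labels: m, a, l, e, f.
distinct_characters = len(set("".join(counts)))

print(n_rows, total_characters, round(mean_length, 6), distinct_characters)
```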

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: male
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value   Count  Frequency (%)
male    577    64.8%
female  314    35.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
male    577    64.8%
female  314    35.2%

Most occurring characters

Value  Count  Frequency (%)
e      1205   28.7%
m      891    21.3%
a      891    21.3%
l      891    21.3%
f      314    7.5%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  4192   100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
e      1205   28.7%
m      891    21.3%
a      891    21.3%
l      891    21.3%
f      314    7.5%

Most occurring scripts

Value  Count  Frequency (%)
Latin  4192   100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
e      1205   28.7%
m      891    21.3%
a      891    21.3%
l      891    21.3%
f      314    7.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4192   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
e      1205   28.7%
m      891    21.3%
a      891    21.3%
l      891    21.3%
f      314    7.5%

Age
Real number (ℝ)

HIGH CORRELATION 

Distinct: 89
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.699118
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 6
Q1: 22
Median: 29.699118
Q3: 35
95-th percentile: 54
Maximum: 80
Range: 79.58
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 13.002015
Coefficient of variation (CV): 0.4377913
Kurtosis: 0.9662793
Mean: 29.699118
Median Absolute Deviation (MAD): 6.3008824
Skewness: 0.43448809
Sum: 26461.914
Variance: 169.0524
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
29.69911765        177    19.9%
24                 30     3.4%
22                 27     3.0%
18                 26     2.9%
28                 25     2.8%
30                 25     2.8%
19                 25     2.8%
21                 24     2.7%
25                 23     2.6%
36                 22     2.5%
Other values (79)  487    54.7%

Value  Count  Frequency (%)
0.42   1      0.1%
0.67   1      0.1%
0.75   2      0.2%
0.83   2      0.2%
0.92   1      0.1%
1      7      0.8%
2      10     1.1%
3      6      0.7%
4      10     1.1%
5      4      0.4%

Value  Count  Frequency (%)
80     1      0.1%
74     1      0.1%
71     2      0.2%
70.5   1      0.1%
70     2      0.2%
66     1      0.1%
65     3      0.3%
64     2      0.2%
63     2      0.2%
62     4      0.4%

SibSp
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.52300786
Minimum: 0
Maximum: 8
Zeros: 608
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1027434
Coefficient of variation (CV): 2.1084644
Kurtosis: 17.88042
Mean: 0.52300786
Median Absolute Deviation (MAD): 0
Skewness: 3.6953517
Sum: 466
Variance: 1.2160431
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Value  Count  Frequency (%)
0      608    68.2%
1      209    23.5%
2      28     3.1%
4      18     2.0%
3      16     1.8%
8      7      0.8%
5      5      0.6%

Value  Count  Frequency (%)
0      608    68.2%
1      209    23.5%
2      28     3.1%
3      16     1.8%
4      18     2.0%
5      5      0.6%
8      7      0.8%

Value  Count  Frequency (%)
8      7      0.8%
5      5      0.6%
4      18     2.0%
3      16     1.8%
2      28     3.1%
1      209    23.5%
0      608    68.2%

Parch
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38159371
Minimum: 0
Maximum: 6
Zeros: 678
Zeros (%): 76.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80605722
Coefficient of variation (CV): 2.1123441
Kurtosis: 9.7781252
Mean: 0.38159371
Median Absolute Deviation (MAD): 0
Skewness: 2.749117
Sum: 340
Variance: 0.64972824
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Value  Count  Frequency (%)
0      678    76.1%
1      118    13.2%
2      80     9.0%
5      5      0.6%
3      5      0.6%
4      4      0.4%
6      1      0.1%

Value  Count  Frequency (%)
0      678    76.1%
1      118    13.2%
2      80     9.0%
3      5      0.6%
4      4      0.4%
5      5      0.6%
6      1      0.1%

Value  Count  Frequency (%)
6      1      0.1%
5      5      0.6%
4      4      0.4%
3      5      0.6%
2      80     9.0%
1      118    13.2%
0      678    76.1%

Ticket
Text

Distinct: 681
Distinct (%): 76.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 18
Median length: 17
Mean length: 6.7508418
Min length: 3

Characters and Unicode

Total characters: 6015
Distinct characters: 35
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 547
Unique (%): 61.4%

Sample

1st row: A/5 21171
2nd row: PC 17599
3rd row: STON/O2. 3101282
4th row: 113803
5th row: 373450

Value               Count  Frequency (%)
pc                  60     5.3%
c.a                 27     2.4%
a/5                 17     1.5%
ca                  14     1.2%
ston/o              12     1.1%
2                   12     1.1%
sc/paris            9      0.8%
w./c                9      0.8%
soton/o.q           8      0.7%
347082              7      0.6%
Other values (709)  955    84.5%

Most occurring characters

Value              Count  Frequency (%)
3                  746    12.4%
1                  689    11.5%
2                  594    9.9%
7                  490    8.1%
4                  464    7.7%
6                  422    7.0%
0                  406    6.7%
5                  387    6.4%
9                  328    5.5%
8                  282    4.7%
Other values (25)  1207   20.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     4808   79.9%
Uppercase Letter   652    10.8%
Other Punctuation  295    4.9%
Space Separator    239    4.0%
Lowercase Letter   21     0.3%

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
C                 151    23.2%
O                 100    15.3%
P                 98     15.0%
A                 82     12.6%
S                 74     11.3%
N                 40     6.1%
T                 36     5.5%
W                 16     2.5%
Q                 15     2.3%
I                 11     1.7%
Other values (6)  29     4.4%

Decimal Number
Value  Count  Frequency (%)
3      746    15.5%
1      689    14.3%
2      594    12.4%
7      490    10.2%
4      464    9.7%
6      422    8.8%
0      406    8.4%
5      387    8.0%
9      328    6.8%
8      282    5.9%

Lowercase Letter
Value  Count  Frequency (%)
a      6      28.6%
s      5      23.8%
r      4      19.0%
i      4      19.0%
l      1      4.8%
e      1      4.8%

Other Punctuation
Value  Count  Frequency (%)
.      197    66.8%
/      98     33.2%

Space Separator
Value    Count  Frequency (%)
(space)  239    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  5342   88.8%
Latin   673    11.2%

Most frequent character per script

Latin
Value              Count  Frequency (%)
C                  151    22.4%
O                  100    14.9%
P                  98     14.6%
A                  82     12.2%
S                  74     11.0%
N                  40     5.9%
T                  36     5.3%
W                  16     2.4%
Q                  15     2.2%
I                  11     1.6%
Other values (12)  50     7.4%

Common
Value             Count  Frequency (%)
3                 746    14.0%
1                 689    12.9%
2                 594    11.1%
7                 490    9.2%
4                 464    8.7%
6                 422    7.9%
0                 406    7.6%
5                 387    7.2%
9                 328    6.1%
8                 282    5.3%
Other values (3)  534    10.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6015   100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
3                  746    12.4%
1                  689    11.5%
2                  594    9.9%
7                  490    8.1%
4                  464    7.7%
6                  422    7.0%
0                  406    6.7%
5                  387    6.4%
9                  328    5.5%
8                  282    4.7%
Other values (25)  1207   20.1%

Fare
Real number (ℝ)

ZEROS 

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.204208
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95-th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0896

Descriptive statistics

Standard deviation: 49.693429
Coefficient of variation (CV): 1.5430725
Kurtosis: 33.398141
Mean: 32.204208
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.7873165
Sum: 28693.949
Variance: 2469.4368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
8.05                43     4.8%
13                  42     4.7%
7.8958              38     4.3%
7.75                34     3.8%
26                  31     3.5%
10.5                24     2.7%
7.925               18     2.0%
7.775               16     1.8%
7.2292              15     1.7%
0                   15     1.7%
Other values (238)  615    69.0%

Value   Count  Frequency (%)
0       15     1.7%
4.0125  1      0.1%
5       1      0.1%
6.2375  1      0.1%
6.4375  1      0.1%
6.45    1      0.1%
6.4958  2      0.2%
6.75    2      0.2%
6.8583  1      0.1%
6.95    1      0.1%

Value     Count  Frequency (%)
512.3292  3      0.3%
263       4      0.4%
262.375   2      0.2%
247.5208  2      0.2%
227.525   4      0.4%
221.7792  1      0.1%
211.5     1      0.1%
211.3375  3      0.3%
164.8667  2      0.2%
153.4625  3      0.3%

Embarked
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value  Count
S      646
C      168
Q      77

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S
2nd row: C
3rd row: S
4th row: S
5th row: S

Common Values

Value  Count  Frequency (%)
S      646    72.5%
C      168    18.9%
Q      77     8.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
s      646    72.5%
c      168    18.9%
q      77     8.6%

Most occurring characters

Value  Count  Frequency (%)
S      646    72.5%
C      168    18.9%
Q      77     8.6%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  891    100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
S      646    72.5%
C      168    18.9%
Q      77     8.6%

Most occurring scripts

Value  Count  Frequency (%)
Latin  891    100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
S      646    72.5%
C      168    18.9%
Q      77     8.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  891    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
S      646    72.5%
C      168    18.9%
Q      77     8.6%

Age_class
Categorical

HIGH CORRELATION  MISSING 

Distinct: 3
Distinct (%): 0.3%
Missing: 14
Missing (%): 1.6%
Memory size: 1.1 KiB

Value           Count
성년 (adult)    690
미성년 (minor)  165
노년 (elderly)  22

Length

Max length: 3
Median length: 2
Mean length: 2.1881414
Min length: 2

Characters and Unicode

Total characters: 1919
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 성년
2nd row: 성년
3rd row: 성년
4th row: 성년
5th row: 성년

Common Values

Value      Count  Frequency (%)
성년       690    77.4%
미성년     165    18.5%
노년       22     2.5%
(Missing)  14     1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
성년    690    78.7%
미성년  165    18.8%
노년    22     2.5%

Most occurring characters

Value  Count  Frequency (%)
년     877    45.7%
성     855    44.6%
미     165    8.6%
노     22     1.1%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  1919   100.0%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
년     877    45.7%
성     855    44.6%
미     165    8.6%
노     22     1.1%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1919   100.0%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
년     877    45.7%
성     855    44.6%
미     165    8.6%
노     22     1.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1919   100.0%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
년     877    45.7%
성     855    44.6%
미     165    8.6%
노     22     1.1%

Age_class2
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value           Count
성년 (adult)    304
미성년 (minor)  301
노년 (elderly)  286

Length

Max length: 3
Median length: 2
Mean length: 2.3378227
Min length: 2

Characters and Unicode

Total characters: 2083
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 미성년
2nd row: 노년
3rd row: 성년
4th row: 노년
5th row: 노년

Common Values

Value   Count  Frequency (%)
성년    304    34.1%
미성년  301    33.8%
노년    286    32.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
성년    304    34.1%
미성년  301    33.8%
노년    286    32.1%

Most occurring characters

Value  Count  Frequency (%)
년     891    42.8%
성     605    29.0%
미     301    14.5%
노     286    13.7%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  2083   100.0%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
년     891    42.8%
성     605    29.0%
미     301    14.5%
노     286    13.7%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  2083   100.0%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
년     891    42.8%
성     605    29.0%
미     301    14.5%
노     286    13.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  2083   100.0%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
년     891    42.8%
성     605    29.0%
미     301    14.5%
노     286    13.7%

Interactions

[Figures: 25 pairwise interaction plots]

Correlations

             PassengerId  Age     SibSp   Parch   Fare    Survived  Pclass  Sex     Embarked  Age_class  Age_class2
PassengerId  1.000        0.042   -0.061  0.001   -0.014  0.104     0.032   0.066   0.000     0.000      0.024
Age          0.042        1.000   -0.147  -0.217  0.119   0.158     0.265   0.106   0.151     0.829      0.930
SibSp        -0.061       -0.147  1.000   0.450   0.447   0.187     0.148   0.206   0.092     0.240      0.204
Parch        0.001        -0.217  0.450   1.000   0.410   0.157     0.022   0.247   0.052     0.220      0.211
Fare         -0.014       0.119   0.447   0.410   1.000   0.283     0.479   0.189   0.195     0.063      0.130
Survived     0.104        0.158   0.187   0.157   0.283   1.000     0.337   0.540   0.164     0.046      0.054
Pclass       0.032        0.265   0.148   0.022   0.479   0.337     1.000   0.130   0.258     0.138      0.233
Sex          0.066        0.106   0.206   0.247   0.189   0.540     0.130   1.000   0.111     0.101      0.064
Embarked     0.000        0.151   0.092   0.052   0.195   0.164     0.258   0.111   1.000     0.000      0.157
Age_class    0.000        0.829   0.240   0.220   0.063   0.046     0.138   0.101   0.000     1.000      0.509
Age_class2   0.024        0.930   0.204   0.211   0.130   0.054     0.233   0.064   0.157     0.509      1.000
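
The report does not say how the coefficients for categorical pairs such as Sex and Survived are computed. One plausible reading, treated here strictly as an assumption, is a bias-corrected Cramér's V with a Yates continuity correction for 2x2 tables. The sketch below implements that statistic with the standard library only. The Sex by Survived cross-tabulation is not part of this report; the counts used are those commonly quoted for the public Titanic training set and are consistent with the marginals reported above (314 female, 577 male, 549 not survived, 342 survived). `cramers_v_corrected` is a hypothetical helper name.

```python
import math

def cramers_v_corrected(table, yates=False):
    """Bias-corrected Cramér's V for a contingency table given as rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared, optionally with Yates continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(0.0, diff - 0.5)
            chi2 += diff ** 2 / expected

    # Bias correction of phi^2 and of the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 1.0

# Assumed cross-tabulation (not taken from this report).
sex_by_survived = [
    [81, 233],   # female: not survived, survived
    [468, 109],  # male: not survived, survived
]
v = cramers_v_corrected(sex_by_survived, yates=True)
print(f"Cramér's V (Sex, Survived): {v:.3f}")
```

Under these assumptions the result agrees with the 0.540 shown for Sex and Survived in the matrix above. That agreement supports the reading but does not confirm how the profiler actually computes the value.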

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

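Index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Embarked | Age_class | Age_class2
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.000000 | 1 | 0 | A/5 21171 | 7.2500 | S | 성년 | 미성년
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.000000 | 1 | 0 | PC 17599 | 71.2833 | C | 성년 | 노년
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.000000 | 0 | 0 | STON/O2. 3101282 | 7.9250 | S | 성년 | 성년
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.000000 | 1 | 0 | 113803 | 53.1000 | S | 성년 | 노년
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.000000 | 0 | 0 | 373450 | 8.0500 | S | 성년 | 노년
5 | 6 | 0 | 3 | Moran, Mr. James | male | 29.699118 | 0 | 0 | 330877 | 8.4583 | Q | 성년 | 성년
6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.000000 | 0 | 0 | 17463 | 51.8625 | S | 성년 | 노년
7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.000000 | 3 | 1 | 349909 | 21.0750 | S | 미성년 | 미성년
8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.000000 | 0 | 2 | 347742 | 11.1333 | S | 성년 | 성년
9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.000000 | 1 | 0 | 237736 | 30.0708 | C | 미성년 | 미성년

Index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Embarked | Age_class | Age_class2
881 | 882 | 0 | 3 | Markun, Mr. Johann | male | 33.000000 | 0 | 0 | 349257 | 7.8958 | S | 성년 | 노년
882 | 883 | 0 | 3 | Dahlberg, Miss. Gerda Ulrika | female | 22.000000 | 0 | 0 | 7552 | 10.5167 | S | 성년 | 미성년
883 | 884 | 0 | 2 | Banfield, Mr. Frederick James | male | 28.000000 | 0 | 0 | C.A./SOTON 34068 | 10.5000 | S | 성년 | 성년
884 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | male | 25.000000 | 0 | 0 | SOTON/OQ 392076 | 7.0500 | S | 성년 | 미성년
885 | 886 | 0 | 3 | Rice, Mrs. William (Margaret Norton) | female | 39.000000 | 0 | 5 | 382652 | 29.1250 | Q | 성년 | 노년
886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.000000 | 0 | 0 | 211536 | 13.0000 | S | 성년 | 성년
887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.000000 | 0 | 0 | 112053 | 30.0000 | S | 미성년 | 미성년
888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | 29.699118 | 1 | 2 | W./C. 6607 | 23.4500 | S | 성년 | 성년
889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.000000 | 0 | 0 | 111369 | 30.0000 | C | 성년 | 성년
890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.000000 | 0 | 0 | 370376 | 7.7500 | Q | 성년 | 노년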